6 research outputs found

AsterixDB: A Scalable, Open Source BDMS

Authors: Alsubaiee Sattam, Altowim Yasser, Altwaijry Hotham, Behm Alexander, Borkar Vinayak, Bu Yingyi, Carey Michael, Cetindil Inci, Cheelangi Madhusudan, Faraaz Khurram, Gabrielova Eugenia, Grover Raman, Heilbron Zachary, Kim Young-Seok, Li Chen, Li Guangqiang, Ok Ji Mahn, Onose Nicola, Pirzadeh Pouria, Tsotras Vassilis, Vernica Rares, Wen Jian, Westmann Till
Publication date: 02/07/2014

AsterixDB is a new, full-function BDMS (Big Data Management System) with a feature set that distinguishes it from other platforms in today's open source Big Data ecosystem. Its features make it well-suited to applications like web data warehousing, social data storage and analysis, and other use cases related to Big Data. AsterixDB has a flexible NoSQL-style data model; a query language that supports a wide range of queries; a scalable runtime; partitioned, LSM-based data storage and indexing (including B+-tree, R-tree, and text indexes); support for external as well as natively stored data; a rich set of built-in types; support for fuzzy, spatial, and temporal types and queries; a built-in notion of data feeds for ingestion of data; and transaction support akin to that of a NoSQL store. Development of AsterixDB began in 2009 and led to a mid-2013 initial open source release. This paper is the first complete description of the resulting open source AsterixDB system. Covered herein are the system's data model, its query language, and its software architecture. Also included are a summary of the current status of the project and a first glimpse into how AsterixDB performs when compared to alternative technologies, including a parallel relational DBMS, a popular NoSQL store, and a popular Hadoop-based SQL data analytics platform, for tasks that both AsterixDB and the compared technology can perform. The paper closes with a brief description of some initial trials that the system has undergone and the lessons learned (and plans laid) based on those early "customer" engagements.

arXiv.org e-Print Archive

CiteSeerX

On the Performance Evaluation of Big Data Systems

Author: Pirzadeh Pouria
Publication venue: eScholarship, University of California
Publication date: 01/01/2015

Big Data is becoming a key basis for competition and growth among businesses. The emerging need to store and process huge volumes of data has led to the appearance of different Big Data serving systems with fundamental differences. Big Data benchmarking helps users pick the right system for their applications' needs. It can also help the developers of these systems make the right decisions in building and extending them. While there have been several major efforts in benchmarking Big Data systems, there are still a number of challenges and unmet needs in this area. This dissertation aims to contribute to the performance evaluation of Big Data systems in two major respects. First, it uses both new and existing benchmarks to perform a deep performance analysis of two major classes of modern Big Data systems, namely NoSQL request-serving systems and Big Data analytics platforms. Second, it looks at Big Data benchmarking from a new angle by comparing Big Data systems with respect to their features and functionality. This effort is especially important given the increasing interest in using a unified, functionality-rich system to serve different types of workloads, particularly as Big Data applications evolve and become more complex.

Ezid

eScholarship - University of California

Performance Evaluation of Range Queries in Key Value Stores

Authors: BF Cooper, Hakan Hacıgümüş, J Gray, Junichi Tatemura, Oliver Po, PL Lehman, Pouria Pirzadeh, R Cattell
Publication venue: Springer Science and Business Media LLC

Crossref